Abstract. In this pedagogical lecture, I introduce some of the basic
terminology and description of fluctuating fields as they occur in cosmology.
I define statistical, cosmological and sample homogeneity and explain what is
meant by the fair sample hypothesis and cosmic variance. I illustrate these
concepts using the simplest second-order statistics, i.e. the two-point
correlation function and its Fourier transform, the power-spectrum. I then
give a brief overview of the information carried by the phases of the Fourier
modes of cosmological fluctuations, which is not contained in these simpler
statistics. Specifically, I explain how phase information of a particular form
(called quadratic phase coupling) is encoded in the three-point correlation
function (or, equivalently, the bispectrum).



1. Introduction 

In most popular versions of the gravitational instability model for the ori- 
gin of cosmic structure, particularly those involving cosmic inflation (Guth 
1981; Guth & Pi 1982), the initial fluctuations that seeded the structure 
formation process form a Gaussian random field (Adler 1981; Bardeen et 
al. 1986). Gaussian random fields are the simplest fully-defined stochastic 
processes, which makes analysis of them relatively straightforward. Robust 
and powerful statistical descriptors can be constructed that have a firm 
mathematical underpinning and are relatively simple to implement. Second-order
statistics such as the ubiquitous power-spectrum (e.g. Peacock & Dodds 1996)
furnish a complete description of Gaussian fields. They have consequently
yielded invaluable insights into the behaviour of large-scale structure in the
latest generation of redshift surveys, such as the 2dFGRS (Percival et al.
2001). Important though these methods undoubtedly are, the era of precision
cosmology we are now entering requires more thought
to be given to methods for both detecting and exploiting departures from 
Gaussian behaviour. 

The pressing need for statistics appropriate to the analysis of non-linear 
stochastic processes also suggests a need to revisit some of the fundamental 
properties cosmologists usually assume when studying samples of the Uni- 
verse. Gaussian random fields have many useful properties. It is straightfor- 
ward to impose constraints that result in statistically homogeneous fields, 
for example. Perhaps more relevantly one can understand the conditions 
under which averages over a single spatial domain are well-defined, the con- 
straint of sample-homogeneity. The conditions under which such fields can 
be ergodic are also well established. It is known that smoothing Gaussian 
fields preserves Gaussianity, and so on. These properties are all somewhat 
related, but not identical. Indeed, looking at the corresponding properties 
of non-linear fields turns up some interesting results and delivers warnings 
to be careful. Exploring these properties is the first aim of this lecture. 

Even if the primordial density fluctuations were indeed Gaussian, the 
later stages of gravitational clustering must induce some form of non- 
linearity. One particular way of looking at this issue is to study the be- 
haviour of Fourier modes of the cosmological density field. If the hypothesis 
of primordial Gaussianity is correct then these modes began with random 
spatial phases. In the early stages of evolution, the plane-wave components 
of the density evolve independently like linear waves on the surface of deep 
water. As the structures grow in mass, they interact with each other in
non-linear ways, more like waves breaking in shallow water. These mode-mode
interactions lead to the generation of coupled phases. While the Fourier
phases of a Gaussian field contain no information (they are random),
non-linearity generates non-random phases that contain much information about
the spatial
pattern of the fluctuations. Although the significance of phase information 
in cosmology is still not fully understood, there have been a number of 
attempts to gain quantitative insight into the behaviour of phases in gravi- 
tational systems. Ryden & Gramann (1991), Soda & Suto (1992) and Jain 
& Bertschinger (1998) concentrated on the evolution of phase shifts for in- 
dividual modes using perturbation theory and numerical simulations. An 
alternative approach was adopted by Scherrer, Melott & Shandarin (1991),
who developed a practical method for measuring the phase coupling in
random fields that could be applied to real data. Most recently Chiang &
Coles (2000), Coles & Chiang (2000), Chiang (2001) and Chiang, Naselsky 
& Coles (2002) have explored the evolution of phase information in some 
detail.

Despite this recent progress, there is still no clear understanding of how
the behaviour of the Fourier phases manifests itself in more orthodox statis- 
tical descriptors. In particular there is much interest in the usefulness of the 
simplest possible generalisation of the (second-order) power-spectrum, i.e. 
the (third-order) bispectrum (Peebles 1980; Scoccimarro et al. 1998; Scoc- 
cimarro, Couchman & Frieman 1999; Verde et al. 2000; Verde et al. 2001; 
Verde et al. 2002). Since the bispectrum is identically zero for a Gaus- 
sian random field, it is generally accepted that the bispectrum encodes 
some form of phase information but it has never been elucidated exactly 
what form of correlation it measures. Further possible generalisations of the 
bispectrum are usually called polyspectra; they include the (fourth-order) 
trispectrum (Verde & Heavens 2001) or a related but simpler statistic called 
the second-spectrum (Stirling & Peacock 1996). Exploring the connection 
between polyspectra and non-linearly induced phase association is the sec- 
ond aim of this lecture. 

The plan is as follows. In the following section I introduce some fundamental
concepts underlying statistical cosmology, more-or-less from first principles.
I do this in order to allow the reader to see explicitly what assumptions
underlie standard statistical practice. In Section 3 I look at some
of the contexts in which quadratic non-linearity may arise, either primor- 
dially or during the non-linear growth of structure from Gaussian fields. In 
Section 4 I revisit some of the basic properties used in Section 2 from the 
viewpoint of a particularly simple form of non-linearity, known as quadratic 
non-linearity, and show how some basic implicit assumptions may be vio- 
lated. I then, in Section 5, explore how phase correlations arise in quadratic 
fields and relate these to higher-order statistics of quadratic fields. 

2. Basic Statistical Concepts 

I start by giving some general definitions of concepts which I will later use 
in relation to the particular case of cosmological density fields. In order to 
put our results in a clear context, I develop the basic statistical description 
of cosmological density fields; see also, e.g., Peebles (1980) and Coles &
Lucchin (2002).

2.1. FOURIER DESCRIPTION 

I follow standard practice and consider a region of the Universe having
volume $V_u$, for convenience assumed to be a cube of side $L \gg l_s$, where
$l_s$ is the maximum scale at which there is significant structure due to the
perturbations. The region $V_u$ can be thought of as a "fair sample" of the
Universe if this is the case. It is possible to construct, formally, a
"realisation" of the Universe by dividing it into cells of volume $V_u$ with
periodic boundary conditions at the faces of each cube. This device is often
convenient, but in any case one often takes the limit $V_u \to \infty$. Let us
denote by $\bar{\rho}$ the mean density in a volume $V_u$ and take
$\rho(\mathbf{x})$ to be the density at a point in this region specified by
the position vector $\mathbf{x}$ with respect to some arbitrary origin. As
usual, the fluctuation is defined to be

$$\delta(\mathbf{x}) = [\rho(\mathbf{x}) - \bar{\rho}]/\bar{\rho}. \qquad (1)$$

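We assume this to be expressible as a Fourier series:

$$\delta(\mathbf{x}) = \sum_{\mathbf{k}} \delta_{\mathbf{k}} \exp(i\mathbf{k}\cdot\mathbf{x}) = \sum_{\mathbf{k}} \delta_{\mathbf{k}}^{*} \exp(-i\mathbf{k}\cdot\mathbf{x}); \qquad (2)$$

the appropriate inverse relationship is of the form

$$\delta_{\mathbf{k}} = \frac{1}{V_u} \int_{V_u} \delta(\mathbf{x}) \exp(-i\mathbf{k}\cdot\mathbf{x})\, d\mathbf{x}. \qquad (3)$$

The Fourier coefficients are complex quantities,

$$\delta_{\mathbf{k}} = |\delta_{\mathbf{k}}| \exp(i\phi_{\mathbf{k}}), \qquad (4)$$

with amplitude $|\delta_{\mathbf{k}}|$ and phase $\phi_{\mathbf{k}}$. The
assumption of periodic boundaries results in a discrete k-space
representation; the sum is taken from the Nyquist frequency
$k_{\rm Ny} = 2\pi/L$, where $V_u = L^3$, to infinity. Note that as
$L \to \infty$, $k_{\rm Ny} \to 0$. Conservation of mass in $V_u$ implies
$\delta_{\mathbf{k}=0} = 0$ and the reality of $\delta(\mathbf{x})$ requires
$\delta_{\mathbf{k}}^{*} = \delta_{-\mathbf{k}}$.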
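
The properties just listed can be checked numerically. The following is a minimal sketch, assuming a one-dimensional periodic grid in place of the cube $V_u$ and using NumPy's FFT, whose forward transform has the same sign convention as equation (3). The lognormal density and the variable names are illustrative assumptions, not part of the text.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 64                                   # grid points in the periodic box (assumed)
rho = rng.lognormal(size=n)              # any positive density field will do
delta = (rho - rho.mean()) / rho.mean()  # equation (1)

# Forward transform with the 1/V_u normalisation of equation (3).
delta_k = np.fft.fft(delta) / n

amplitude = np.abs(delta_k)              # |delta_k| of equation (4)
phase = np.angle(delta_k)                # phi_k, returned in (-pi, pi]

# Conservation of mass: the k = 0 coefficient vanishes.
zero_mode = abs(delta_k[0])

# Reality of delta(x): conj(delta_k) equals delta_{-k}.
minus_k = (-np.arange(n)) % n            # index of -k on the periodic grid
hermitian_error = np.max(np.abs(np.conj(delta_k) - delta_k[minus_k]))

# Equation (2): summing the modes recovers the original field.
delta_back = np.real(np.fft.ifft(delta_k) * n)
roundtrip_error = np.max(np.abs(delta_back - delta))
```

All three error measures should vanish to rounding precision for any real input field.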

If, instead of the volume $V_u$, we had chosen a different volume $V_u'$, the
perturbation within the new volume would again be represented by a series
of the form (2), but with different coefficients $\delta_{\mathbf{k}}$. Now
consider a (large) number $N$ of realisations of our periodic volume and
label these realisations by $V_{u1}, V_{u2}, V_{u3}, \ldots, V_{uN}$. It is
meaningful to consider the probability distribution
$\mathcal{P}(\delta_{\mathbf{k}})$ of the relevant coefficients
$\delta_{\mathbf{k}}$ from realisation to realisation across this ensemble.
One typically assumes that the distribution is statistically homogeneous and
isotropic, in order to satisfy the Cosmological Principle, and that the real
and imaginary parts of $\delta_{\mathbf{k}}$ have a Gaussian
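distribution and are mutually independent, so that

$$\mathcal{P}(w) = \frac{V_u^{1/2}}{(2\pi\alpha_k^2)^{1/2}} \exp\left(-\frac{w^2 V_u}{2\alpha_k^2}\right), \qquad (5)$$

where $w$ stands for either ${\rm Re}[\delta_{\mathbf{k}}]$ or
${\rm Im}[\delta_{\mathbf{k}}]$ and $\alpha_k^2 = \sigma_k^2/2$; $\sigma_k^2$
is the spectrum. This is the same as the assumption that the phases
$\phi_{\mathbf{k}}$ in equation (4) are mutually independent and randomly
distributed over the interval between $\phi = 0$ and $\phi = 2\pi$. In this
case the moduli of the Fourier amplitudes have a Rayleigh distribution:

$$\mathcal{P}(|\delta_{\mathbf{k}}|, \phi_{\mathbf{k}})\, d|\delta_{\mathbf{k}}|\, d\phi_{\mathbf{k}} = \frac{|\delta_{\mathbf{k}}| V_u}{2\pi\alpha_k^2} \exp\left(-\frac{|\delta_{\mathbf{k}}|^2 V_u}{2\alpha_k^2}\right) d|\delta_{\mathbf{k}}|\, d\phi_{\mathbf{k}}. \qquad (6)$$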

Because of the assumption of statistical homogeneity and isotropy, the
quantity $\mathcal{P}(\delta_{\mathbf{k}})$ depends only on the modulus of the
wavevector $\mathbf{k}$ and not on its direction. It is fairly simple to show
that, if the Fourier quantities $|\delta_{\mathbf{k}}|$ have the Rayleigh
distribution, then the probability distribution $\mathcal{P}(\delta)$ of
$\delta = \delta(\mathbf{x})$ in real space is Gaussian, so that:

$$\mathcal{P}(\delta)\, d\delta = \frac{1}{(2\pi\sigma^2)^{1/2}} \exp\left(-\frac{\delta^2}{2\sigma^2}\right) d\delta, \qquad (7)$$

where $\sigma^2$ is the variance of the density field $\delta(\mathbf{x})$.
This is a strict definition of Gaussianity. However, Gaussian statistics do
not always require the distribution (6) for the Fourier component amplitudes.
According to its Fourier expansion, $\delta(\mathbf{x})$ is simply a sum over
a large number of Fourier modes whose amplitudes are drawn from some
distribution. If the phases of each of these modes are random, then the
Central Limit Theorem will guarantee that the resulting superposition will be
close to a Gaussian if the number of modes is large and the distribution of
amplitudes has finite variance. Such fields are called weakly Gaussian.
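
The idea of a weakly Gaussian field can be illustrated with a short numerical experiment. This is only a sketch under stated assumptions: a one-dimensional field, a deliberately non-Rayleigh choice in which every mode has the same amplitude, and phases drawn uniformly on $[0, 2\pi)$. The grid size and seed are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4096                      # number of grid points (assumed)
n_modes = n // 2 - 1          # independent modes, excluding k = 0 and Nyquist

# Every mode gets unit amplitude (not Rayleigh) and a random phase.
half_spectrum = np.zeros(n // 2 + 1, dtype=complex)
phases = rng.uniform(0.0, 2.0 * np.pi, size=n_modes)
half_spectrum[1:n_modes + 1] = np.exp(1j * phases)

# irfft builds the real field implied by the Hermitian condition.
field = np.fft.irfft(half_spectrum, n=n)
field /= field.std()          # normalise to unit variance

# One-point moments; both vanish for an exactly Gaussian distribution.
skewness = np.mean(field**3)
excess_kurtosis = np.mean(field**4) - 3.0
```

With a few thousand modes both statistics should be close to zero, even though the amplitudes are not Rayleigh distributed.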



2.2. COVARIANCE FUNCTIONS & PROBABILITY DENSITIES 

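I now discuss the real-space statistical properties of spatial perturbations
in $\rho$. The covariance function is defined in terms of the density
fluctuation by

$$\xi(\mathbf{r}) = \frac{\langle [\rho(\mathbf{x}) - \bar{\rho}][\rho(\mathbf{x}+\mathbf{r}) - \bar{\rho}] \rangle}{\bar{\rho}^2} = \langle \delta(\mathbf{x})\delta(\mathbf{x}+\mathbf{r}) \rangle. \qquad (8)$$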

The angle brackets in this expression indicate two levels of averaging: first 
a volume average over a representative patch of the universe and second 
an average over different patches within the ensemble, in the manner of 
§2.1. Applying the Fourier machinery to equation (8) one arrives at the 
Wiener-Khintchin theorem, relating the covariance to the spectral density 
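function or power spectrum, $P(k)$:

$$\xi(\mathbf{r}) = \sum_{\mathbf{k}} \langle |\delta_{\mathbf{k}}|^2 \rangle \exp(-i\mathbf{k}\cdot\mathbf{r}), \qquad (9)$$

which, in passing to the limit $V_u \to \infty$, becomes

$$\xi(\mathbf{r}) = \frac{1}{(2\pi)^3} \int P(k) \exp(-i\mathbf{k}\cdot\mathbf{r})\, d\mathbf{k}. \qquad (10)$$

Averaging equation (9) over $\mathbf{r}$ gives

$$\langle \xi(\mathbf{r}) \rangle_{\mathbf{r}} = \frac{1}{V_u} \sum_{\mathbf{k}} \langle |\delta_{\mathbf{k}}|^2 \rangle \int \exp(-i\mathbf{k}\cdot\mathbf{r})\, d\mathbf{r} = 0. \qquad (11)$$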
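
Equations (9) and (11) can be verified on a grid. The sketch below assumes a single one-dimensional periodic realisation, so only the volume average is taken and the ensemble average is omitted. White Gaussian noise stands in for the density contrast purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 256                          # grid points in the periodic box (assumed)
delta = rng.normal(size=n)
delta -= delta.mean()            # enforce delta_{k=0} = 0

delta_k = np.fft.fft(delta) / n  # normalisation of equation (3)
power = np.abs(delta_k) ** 2     # |delta_k|^2 for this realisation

# Equation (9): sum over k of |delta_k|^2 exp(-i k r). The sign of the
# exponent is immaterial here because |delta_k|^2 is even in k.
xi_fourier = np.real(np.fft.ifft(power) * n)

# Direct volume average of delta(x) delta(x + r) with periodic wrapping.
xi_direct = np.array([np.mean(delta * np.roll(delta, -r)) for r in range(n)])

max_difference = np.max(np.abs(xi_fourier - xi_direct))

# Equation (11): the covariance function averages to zero over the box.
mean_over_r = xi_direct.mean()
```

The two estimates of the covariance function agree to rounding error, and the average over all separations vanishes because the $k = 0$ mode is absent.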

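The function $\xi(r)$ is the two-point covariance function. In an analogous
manner it is possible to define spatial covariance functions for $N > 2$
points. For example, the three-point covariance function is

$$\zeta(r,s) = \frac{\langle [\rho(\mathbf{x}) - \bar{\rho}][\rho(\mathbf{x}+\mathbf{r}) - \bar{\rho}][\rho(\mathbf{x}+\mathbf{s}) - \bar{\rho}] \rangle}{\bar{\rho}^3}, \qquad (12)$$

which gives

$$\zeta(r,s) = \langle \delta(\mathbf{x})\delta(\mathbf{x}+\mathbf{r})\delta(\mathbf{x}+\mathbf{s}) \rangle, \qquad (13)$$

where the spatial average is taken over all the points $\mathbf{x}$ and over
all directions of $\mathbf{r}$ and $\mathbf{s}$ such that
$|\mathbf{r} - \mathbf{s}| = t$: in other words, over all points defining
a triangle with sides $r$, $s$ and $t$. The generalisation of (12) to $N > 3$
is obvious.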

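The covariance functions are related to the moments of the probability
distributions of $\delta(\mathbf{x})$. If the fluctuations form a Gaussian
random field then the N-variate distributions of the set
$\delta_i = \delta(\mathbf{x}_i)$ are just multivariate Gaussians of the form

$$\mathcal{P}_N(\delta_1, \ldots, \delta_N) = \frac{|C^{-1}|^{1/2}}{(2\pi)^{N/2}} \exp\left(-\frac{1}{2} \sum_{i,j} \delta_i C^{-1}_{ij} \delta_j\right). \qquad (14)$$

The correlation matrix $C_{ij}$ can be expressed in terms of the covariance
function

$$C_{ij} = \langle \delta_i \delta_j \rangle = \xi(r_{ij}). \qquad (15)$$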

It is convenient to go a stage further and define the N-point connected
covariance functions as the part of the average
$\langle \delta_1 \ldots \delta_N \rangle$ that is not expressible in terms of
lower order functions, e.g.

$$\langle \delta_1\delta_2\delta_3 \rangle = \langle \delta_1 \rangle_c \langle \delta_2\delta_3 \rangle_c + \langle \delta_2 \rangle_c \langle \delta_1\delta_3 \rangle_c + \langle \delta_3 \rangle_c \langle \delta_1\delta_2 \rangle_c + \langle \delta_1 \rangle_c \langle \delta_2 \rangle_c \langle \delta_3 \rangle_c + \langle \delta_1\delta_2\delta_3 \rangle_c, \qquad (16)$$

where the connected parts are $\langle \delta_1\delta_2\delta_3 \rangle_c$,
$\langle \delta_1\delta_2 \rangle_c$, etc. Since $\langle \delta \rangle = 0$
by construction, $\langle \delta_1 \rangle_c = \langle \delta_1 \rangle = 0$.
Moreover, $\langle \delta_1\delta_2 \rangle_c = \langle \delta_1\delta_2 \rangle$
and $\langle \delta_1\delta_2\delta_3 \rangle_c = \langle \delta_1\delta_2\delta_3 \rangle$.
The second and third order connected parts are simply the same as the
covariance functions. Fourth and higher order quantities are different,
however. The connected functions are just the multivariate generalisation
of the cumulants $\kappa_n$ (Kendall & Stewart 1977). One of the most
important properties of Gaussian fields is that all of their N-point connected
covariances are zero beyond $N = 2$, so that their statistical properties are
fixed once the set of two-point covariances (15) is determined. All
large-scale statistical properties are therefore determined by the asymptotic
behaviour of $\xi(r)$ as $r \to \infty$. This simplifying property is not
shared by non-Gaussian fields, a fact we shall explore in Section 4.
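
To make the remark about fourth order concrete, the standard decomposition for a zero-mean field can be written out explicitly. The shorthand $\xi_{ij} \equiv \langle \delta_i \delta_j \rangle$ is introduced here only for brevity and is not used elsewhere in the text.

```latex
\langle \delta_1 \delta_2 \delta_3 \delta_4 \rangle
  = \langle \delta_1 \delta_2 \delta_3 \delta_4 \rangle_c
  + \xi_{12}\,\xi_{34} + \xi_{13}\,\xi_{24} + \xi_{14}\,\xi_{23}
```

For a Gaussian field the connected term vanishes, so the four-point average is fixed entirely by products of two-point covariances. This is why the fourth-order connected part differs from the full four-point function even when the mean is zero.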

3. Quadratic Non-Linearity 

In this section I discuss some of the circumstances wherein quadratic non- 
linearity may arise. At the outset I should admit that this study is primarily 
phenomenological and our intention is largely to use this as a model that 
displays some consequences of non-linearity. 

3.1. PHENOMENOLOGY 

As far as I am aware, the first application of quadratic density fields in a
cosmological setting was in Coles & Barrow (1987), who were studying the
possible sample properties of non-Gaussian temperature fluctuations on the
cosmic microwave background sky. They in fact studied a series of models
called $\chi^2_n$ models obtained via the transformation

$$Y = X_1^2 + X_2^2 + \ldots + X_n^2, \qquad (17)$$

where $n$ is the order and the $X_i$ are independent Gaussian random fields
with identical covariance functions. Similar models were also explored by
Moscardini et al. (1991) using $Y$ to model either the primordial density
field or the primordial gravitational potential; see also (Koyama, Soda &
Taruya 1999; Verde et al. 2000; Matarrese, Verde & Jimenez 2000; Verde et al.
2001; Komatsu & Spergel 2001). The case $n = 1$ is the quadratic model of
the present paper. There was no physical motivation for this as a model
of CMB temperature fluctuations; it was used simply because one could
calculate analytic results for such a field to compare with similar results
for a Gaussian. Indeed the $\chi^2_n$ random field has been used as a model
for non-Gaussian phenomena in a wide range of fields, including surface
physics and geology (e.g. Adler 1981).
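
The one-point statistics of the transformation (17) are easy to check by simulation. The sketch below assumes unit-variance Gaussian variates at a single point, ignoring spatial correlations, and compares sample moments with the textbook values for a chi-squared variable with $n$ degrees of freedom (mean $n$, variance $2n$, skewness $\sqrt{8/n}$). The order and sample size are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(3)
n_order = 4                   # the order n in equation (17) (assumed value)
n_samples = 200_000

# Independent unit-variance Gaussian variates X_1 ... X_n at one point.
x = rng.normal(size=(n_order, n_samples))
y = np.sum(x**2, axis=0)      # equation (17)

sample_mean = y.mean()
sample_var = y.var()
sample_skew = np.mean((y - sample_mean) ** 3) / sample_var**1.5

expected_mean = n_order
expected_var = 2 * n_order
expected_skew = np.sqrt(8.0 / n_order)
```

The non-zero skewness is the simplest signature that the field is non-Gaussian. It shrinks as $n$ grows, consistent with the Central Limit Theorem.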

3.2. INFLATION 

Since the early 1980s (e.g. Guth & Pi 1982) it has been commonly believed
that the inflationary scenario of the very early Universe results in
the imprint of primordial Gaussian fluctuations. Since then, and with the
invention of increasingly complicated models of the inflationary process,
it has become clearer that inflation can produce significant levels of
non-Gaussianity. The simplest "slow-roll" models of inflation involving a
single dominant scalar field do indeed produce Gaussian fluctuations, but
including non-linear terms and back-reaction does produce some element of
non-Gaussianity as do extra degrees of freedom (Salopek & Bond 1990; Salopek
1992; Falk, Rangarajan & Srednicki 1993; Gangui et al. 1994; Wang & 
Kamionkowski 2000; Bartolo, Matarrese & Riotto 2002). 

Typically the non-linear contributions arising during inflation manifest 
themselves as higher-order contributions to the effective Newtonian gravi- 
tational potential, i.e. 

$$\Phi = \phi + \alpha(\phi^2 - \langle \phi^2 \rangle) + \ldots, \qquad (18)$$

where $\phi$ is a Gaussian field (not the phase) and $\alpha$ is a constant
which is vanishingly small in most models. The term in
$\langle \phi^2 \rangle$ is needed to ensure $\Phi$ has zero mean. Since the
mean value of the Newtonian potential is not physically meaningful anyway
this term is not really important. A more radical suggestion for an
inflation-induced quadratic model is offered by Peebles (1999a,b). In this
model the density field is given by

$$\rho(\mathbf{x}) = \frac{1}{2} m^2 \psi(\mathbf{x})^2, \qquad (19)$$

where $\psi$ is a scalar field and $m$ is an effective mass. I return to the statistical
consequences of this particular model in Section 5. 

3.3. GRAVITATIONAL NON-LINEARITY 

In a simple perturbative model, the non-linear density contrast at a point
$\mathbf{x}$ can be modelled by the relation

$$\delta(\mathbf{x}) = \delta_1(\mathbf{x}) + \epsilon\,\delta_2(\mathbf{x}), \qquad (20)$$

where $\delta_1(\mathbf{x})$ is a Gaussian random field,
$\delta_2(\mathbf{x})$ is a quadratic random field derived by squaring
$\delta_1$ and $\epsilon$ is a small factor that controls the degree of
non-linearity. To be precise I should also include a constant term in this
expression in order to ensure that $\langle \delta(\mathbf{x}) \rangle = 0$,
but this does not play any role in the following for reasons mentioned above
so I ignore it. Using a constant
$\epsilon$ is not rigorous but at least qualitatively it shows how the lowest non-linear
corrections come into play. For a detailed discussion of perturbation theory 
done properly see Bernardeau et al. (2002). 
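
A quick way to see the effect of the quadratic term in (20) is to look at the third moment at a single point. The sketch below assumes $\delta_1$ is a unit-variance Gaussian variate and takes $\delta_2 = \delta_1^2 - 1$, i.e. it includes the constant term that the text chooses to ignore. Under those assumptions a direct calculation with Gaussian moments gives $\langle \delta^3 \rangle = 6\epsilon + 8\epsilon^3$. The value of $\epsilon$ is an arbitrary illustration.

```python
import numpy as np

rng = np.random.default_rng(4)
eps = 0.1                          # assumed small non-linearity parameter
g = rng.normal(size=1_000_000)     # delta_1 at one point, unit variance

# Equation (20) with the constant term restored so that <delta> = 0.
delta = g + eps * (g**2 - 1.0)

sample_mean = delta.mean()
third_moment = np.mean(delta**3)

# Analytic value: 3*eps*<g^2 (g^2 - 1)> + eps^3 <(g^2 - 1)^3> = 6 eps + 8 eps^3.
expected_third = 6.0 * eps + 8.0 * eps**3
```

To lowest order the skewness grows linearly with $\epsilon$, which is the one-point counterpart of the statement that quadratic non-linearity generates a non-zero three-point function.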

3.4. BIAS 

Attempts to confront theories of cosmological structure formation with
observations of galaxy clustering are complicated by the uncertain and
possibly biased relationship between galaxies and the distribution of
gravitating matter. A particularly simple and useful way of modelling this
relationship is through the idea of a local bias. In such models, the
propensity of a galaxy to form at a point where the total (local) density of
matter is $\rho$ is taken to be some function $f(\rho)$ (Coles 1993; Fry &
Gaztañaga 1993). This boils down to a statement of the form

$$\delta_g(\mathbf{x}) = f[\delta(\mathbf{x})], \qquad (21)$$

where $\delta_g$ is the density contrast inferred from galaxy counts or other
clustering statistics. In the simplest local bias models, the bias is a
constant usually called $b$, so that $f(\delta) = b\delta$. Clearly a linear
bias of this form simply scales the variance, covariance functions and
power-spectra of the underlying field but has no effect on the detailed form
of the statistical distribution. Models where the bias is non-linear (but
still local) are useful as they place constraints on the effect that the bias
may have on galaxy clustering statistics, without making any particular
assumption about the form of $f$ (Coles 1993). Fry & Gaztañaga (1993)
discussed the implications of bias with the form

$$f(\delta) = \sum_{n=0}^{\infty} b_n \delta^n, \qquad (22)$$

in which the $b_n$ cannot all be chosen independently because the mean of
$\delta_g$ must again be zero. On scales where the density field is linear,
one can therefore see that a non-linear bias with $b_2 \neq 0$ will result in
quadratic contributions to $\delta_g$ even if they do not contribute
significantly to $\delta$.

4. Asymptotic Properties 

In developing the statistical background in Section 2, especially the Fourier 
description of random fluctuating fields, I made a number of assumptions 
along the way that were necessary in order for the resulting descriptors to 
be well-defined. The terminology relating to these assumptions is often used 
very loosely in cosmology, at least partly because they bear a relatively
simple relationship to each other when the fields one is dealing with are
Gaussian. However, in the general case of non-Gaussian fields many sub- 
tleties arise relating to the presence of higher-order statistical correlations. 
It is especially important in the current era of high-precision developments 
in statistical cosmology also to be precise about the foundations. 

In the following I discuss some of the large-scale properties of random 
fields in a formal fashion, with particular reference to the quadratic model. 
The behaviour of covariance functions on large-scales will turn out to be 
very important so, to take the simplest example, consider the Peebles model 
from Section 3.2. In this case we basically have a quadratic density field
$\delta = \psi^2 - \langle \psi^2 \rangle$, where $\psi$ is assumed to be a
statistically homogeneous random field and I have subtracted the mean value
of $\psi^2$ to ensure that $\delta$ has zero mean.

Let us suppose that $\psi$ has a well-behaved covariance function
$\Gamma(r)$ and that the covariance function of $\delta$ is, as usual,
$\xi(r)$. It is trivial to show that

$$\xi(r) = 2\Gamma^2(r), \qquad (23)$$

so that $\xi(r)$ must be positive for all $r$ (Adler 1981). Note that adding
constant terms to $\delta$ would not alter this behaviour. In the more general
case of a field of the form $\delta = \psi + \alpha\psi^2$, such as the
examples given in equations (18) & (20), the resulting covariance function
has a behaviour of the form

$$\xi(r) = \Gamma(r) + \alpha^2\Gamma^2(r). \qquad (24)$$

In this case the covariance function of the resulting field would contain
terms of order $\Gamma(r)$, the corresponding covariance function of the
underlying Gaussian field. If $\Gamma(r) < 0$ on some scale then as long as
$\alpha$ is small, the resulting $\xi(r)$ need not be positive in this case.
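
Relation (23) can be checked by Monte Carlo for a pair of points. The sketch below assumes $\psi$ is Gaussian with unit variance, so that $\Gamma$ at the chosen separation is just the correlation coefficient of the pair. The helper name `quadratic_covariance` and the trial values of $\Gamma$ are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)
n_samples = 500_000

def quadratic_covariance(gamma):
    """Estimate <delta_1 delta_2> for delta = psi^2 - <psi^2>, where the
    Gaussian pair (psi_1, psi_2) has unit variance and covariance gamma."""
    a = rng.normal(size=n_samples)
    b = rng.normal(size=n_samples)
    psi1 = a
    psi2 = gamma * a + np.sqrt(1.0 - gamma**2) * b   # covariance with psi1 is gamma
    delta1 = psi1**2 - 1.0                           # subtract <psi^2> = 1
    delta2 = psi2**2 - 1.0
    return np.mean(delta1 * delta2)

gammas = [-0.5, 0.2, 0.8]
estimates = [quadratic_covariance(g) for g in gammas]
predictions = [2.0 * g**2 for g in gammas]           # equation (23)
```

Note that the negative value of $\Gamma$ still yields a positive covariance for the quadratic field, which is the point made in the text.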

4.1. STATISTICAL HOMOGENEITY 

The formal definition of strict statistical homogeneity for a random field
(also called stationarity) is that the set of finite-dimensional joint
probability distributions, which I called
$\mathcal{P}_N(\delta_1, \ldots, \delta_N)$ in Section 2.2, must be
invariant under spatial translations, i.e.

$$\mathcal{P}_N(\delta(\mathbf{x}_1), \ldots, \delta(\mathbf{x}_N)) = \mathcal{P}_N(\delta(\mathbf{x}_1 + \mathbf{x}), \ldots, \delta(\mathbf{x}_N + \mathbf{x})) \qquad (25)$$

for any $\mathbf{x}$. This must be true for all orders $N$. For a Gaussian
random field in which the form of $\mathcal{P}_N(\delta_1, \ldots, \delta_N)$
is given by equation (14), a necessary and sufficient condition for
$\delta(\mathbf{x})$ to be strictly homogeneous is that the covariance
function $\langle \delta(\mathbf{x}_1)\delta(\mathbf{x}_2) \rangle$ is a
function of $\mathbf{x}_1 - \mathbf{x}_2$ only (Adler 1981). Statistical
isotropy can be added by requiring rotation-invariance. One can define weaker
versions of homogeneity and isotropy according to which only the moments of
the distribution need be translation invariant. For example, second-order
homogeneity and isotropy (all that is required for the analysis of
power-spectra or two-point covariance functions) basically means that
the function $\xi(\mathbf{r})$ does not depend on either the origin or the
direction of $\mathbf{r}$, but only on its modulus. Since the properties of a
Gaussian random field depend only on second-order properties, this weaker
condition is sufficient to ensure condition (25) in this case.

This does not mean that any function satisfying translation and rotation
invariance is necessarily the covariance function of a homogeneous random
field (in either the strict or second-order sense). For one thing the power
spectrum must be positive (or zero) for all $k$, which places a constraint on
the shape of any possible $\xi(r)$, which must be convex. The result (11) also
implies that

(26)

for such fields. A perfectly homogeneous distribution would have $P(k) = 0$
and $\xi(r)$ would be identically zero for all $r$. Note, however, that it is
possible for fields obeying either (23) or (24) to be statistically
homogeneous.

4.2. SAMPLE HOMOGENEITY

Statistical homogeneity plays a vital role in both the analysis of clustering
and the formal development of the theory of cosmological perturbation
growth. Unfortunately the use of the word "homogeneity" in this context
leads to a confusion regarding the more fundamental use of this word in
cosmology. Standard cosmologies are based on the Cosmological Principle,
which requires our Universe to be homogeneous and isotropic on large
scales. More loosely, it needs to be sufficiently homogeneous and isotropic
that the Robertson-Walker metric and the Friedmann equations furnish
an adequate approximation to the evolution of the Universe. Statistical
homogeneity as described above is a much weaker requirement than the
requirement that one realisation from a probability ensemble (our Universe)
has asymptotically small fluctuations when smoothed on a sufficiently large
scale. In fact, the analogous relation to (26) in the case of sample
homogeneity is far stronger:

(27)

as this requires the real fluctuations in density to be asymptotically small
within a single realisation. Notice that the requirement for sample
homogeneity is such that in general the covariance function must change sign,
from positive at the origin, where $\xi(0) = \sigma^2 > 0$, to negative at
some $r$ to make the overall integral (26) converge in the correct way.

It is clear from this discussion that statistical homogeneity does not
require sample homogeneity. Revisiting the quadratic model now reveals
another interesting point: the Peebles model (23) cannot be sample
homogeneous, even if it is statistically homogeneous. If we want $\delta$ to
have a covariance function matching observations, say $\xi(r) \sim r^{-2}$,
then the underlying Gaussian field must have $\Gamma(r) \sim r^{-1}$, which
violates the constraint for it to be sample homogeneous.

This model behaves in a similar way to fractal models of the type discussed,
for example, by Coleman & Pietronero (1992). Mention of fractal
models tends to send mainstream cosmologists screaming into the hills, but
the lack of sample homogeneity they describe need not be damaging to stan- 
dard methods. To give a prominent illustration, consider the behaviour of 
the gravitational potential field defined by a Gaussian random field with the 
Harrison-Zel'dovich spectrum. In such a case, the fluctuations will always 
be small (of order $10^{-5}$ to be consistent with observations) but they are
independent of scale, and thus there is never a scale at which sample homo- 
geneity is exactly reached. It is not particularly important for the purposes 
of galaxy clustering studies that the universe obeys the property of sample 
homogeneity. What is more important is that estimates of statistical
properties obtained from different samples vary with respect to the ensemble-averaged
property in a fashion which is under control for large samples. This does 
not require asymptotic convergence to homogeneity. 

Finally, note that the general quadratic model (24) can be sample
homogeneous if $\Gamma(r)$ obeys condition (27) and $\alpha$ is sufficiently
small. Perturbative corrections, such as those described in Section 3.3, do
not therefore
necessarily induce sample inhomogeneity. 

4.3. ERGODICITY AND FAIR SAMPLES 

I have already introduced the idea of a "fair sample hypothesis", which is
basically that averages over finite patches of the Universe can be treated
as averages over some probability ensemble. Peebles (1980), for example,
gives the definition of a fair sample hypothesis in a number of ways. First,
he states

"..the fair sample hypothesis is taken to mean that the universe is sta- 
tistically homogeneous and isotropic." 

Later we find 

"Samples from well separated spots are uncorrelated, and the collection 
of such samples is a statistical ensemble generated by many independent 
applications ..." 

This second definition is close to the one I used in Section 2, but it is clear 
that it is stronger than the first one. Related to the fair sample hypothesis, 
but not identical to it, is the so-called ergodic property, which is that av- 
erages over an infinite domain within a single realization can be treated as 
averages over the probability ensemble. The second definition of a fair sam- 
ple is a stronger statement than the ergodic property, since it involves the 
properties of finite patches rather than an infinite domain within a single 
realisation from the probability ensemble. 

Ergodic properties are extremely difficult to prove, but results do exist
for Gaussian random fields (Adler 1981). Intriguingly, in this case the
result is extremely simple. The necessary and sufficient condition for a
statistically homogeneous Gaussian random field to be ergodic is that its
power-spectrum (defined above) should be continuous. Continuity of the
power-spectrum leads, by standard Fourier analysis, to the result that

(28)

This requires the covariance function to be decreasing. In fact, any
statistically homogeneous Gaussian field will be ergodic if $\xi(r) \to 0$ as
$r \to \infty$. Notice then that a Gaussian random field can be ergodic
without being sample homogeneous.

A general form of this ergodic theorem does not exist for arbitrary
non-Gaussian random fields, but fortunately this does not matter. What is
needed for statistical cosmology is not an ergodic property but something
closer to a version of the fair sample hypothesis.

Suppose instead we have a sample corresponding to part of one realisation
that covers a finite spatial domain $D$. Suppose we extract some
statistic $Q_D$ from this sample. What we need from a fair sample hypothesis
is that

$$Q_D \simeq \langle Q \rangle, \qquad (29)$$

in other words that the estimate obtained from a finite sample is within
some acceptable margin of an ensemble-averaged statistic. What margin
we would accept is up to us to decide. In any case, the ergodic property
does not require the fair-sample property, as it involves averages over
infinite domains of a single realisation. The fair sample hypothesis does not
require ergodicity, either. If the sample estimate is within the acceptable
tolerance at some scale $D$ then we do not require the departure to reduce
asymptotically any further. To return to the Harrison-Zel'dovich spectrum
mentioned in Section 4.2, note that fluctuations in density on the scale of
the horizon are always of order $10^{-5}$. We might estimate the global value
of $Q$ from any finite volume and get an estimate which is within $10^{-5}$ of
the global value, but the estimate does not improve in accuracy by sampling
larger volumes.

From this we can conclude that the ergodic property is irrelevant and
use of this term should be avoided. There is, however, one particularly neat
relationship between ergodicity and statistical homogeneity for the Peebles
model. Notice if we take $\delta \propto \psi^2$ and require $\psi$ to be a
Gaussian random field with covariance function $\xi(r)$ then the covariance
function of $\delta$ is just $\xi^2(r)$, exactly the form that appears in
equation (28). If we take $\psi$ to be an ergodic Gaussian random field then
this guarantees the resulting quadratic field must be at least second-order
statistically homogeneous.

4.4. SAMPLE VARIANCE AND COSMIC VARIANCE

It is worth at this stage briefly mentioning a couple of the consequences of 
the unavailability of infinite sampling domains. Suppose we Fourier trans- 
form the density field within a finite box using the prescription given in 
Section 2.1. For small values of k, say kg, there will be very few modes in 
the box, so the estimate of the power spectrum at these wavenumbers will 
be subject to a large uncertainty, which we can call the sampling variance. 
If we take a larger box, more modes at wavenumber kg fit into the box and 
the sampling variance consequently reduces. 
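
The dependence of the number of available modes on box size can be made explicit by counting the discrete wavevectors $\mathbf{k} = (2\pi/L)(n_x, n_y, n_z)$ that fall in a fixed shell of wavenumber. This is a sketch only: the function name `count_modes`, the box sizes and the shell edges are hypothetical choices, and lengths are in arbitrary units.

```python
import numpy as np

def count_modes(box_size, k_lo, k_hi):
    """Count discrete Fourier modes of a periodic cube of side box_size
    whose wavenumber modulus lies in the shell [k_lo, k_hi)."""
    k_fund = 2.0 * np.pi / box_size              # spacing of the k-space lattice
    n_max = int(np.ceil(k_hi / k_fund))
    n = np.arange(-n_max, n_max + 1)
    nx, ny, nz = np.meshgrid(n, n, n, indexing="ij")
    k_mod = k_fund * np.sqrt(nx**2 + ny**2 + nz**2)
    # Modes k and -k are both counted; for a real field they are conjugates,
    # so the number of independent modes is half this value.
    return int(np.count_nonzero((k_mod >= k_lo) & (k_mod < k_hi)))

small_box = count_modes(100.0, 0.10, 0.15)
large_box = count_modes(200.0, 0.10, 0.15)
```

Doubling the side of the box increases the volume, and hence the density of modes in k-space, by a factor of eight, so the larger box holds roughly eight times as many modes in the same shell and the sampling variance falls accordingly.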

This form of uncertainty should be distinguished from so-called "cos- 
mic variance" which is perhaps easier to understand in the framework of 
temperature fluctuations in the cosmic microwave background. These are 
described in terms of a spherical harmonic expansion of the form 

$$\frac{\Delta T}{T}(\theta, \phi) = \sum_{l=0}^{\infty} \sum_{m=-l}^{+l} a_{lm} Y_{lm}(\theta, \phi) \qquad (30)$$

rather than a Fourier series. Notice that the low $l$ modes, such as the 
quadrupole ($l = 2$), have only a small number of independent $a_{lm}$, so 
estimates of the (angular) power-spectrum at low $l$ are uncertain even if the 
whole sky were available. Nothing can be done to reduce this uncertainty, 
so it is called "cosmic" variance. Of course similar considerations to those 
discussed above apply when only a patch of the sky is available, so 
temperature maps may have sampling variance too, but cosmic variance is a term 
that refers to an irreducible source of uncertainty. 
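
As a rough illustration of how large this irreducible uncertainty is, the sketch below assumes Gaussian temperature fluctuations, so that each multipole carries $2l + 1$ independent real degrees of freedom and the full-sky estimate of the angular power $C_l$ has fractional scatter $\sqrt{2/(2l+1)}$. The function names and the Monte Carlo settings are hypothetical.

```python
import numpy as np

def cosmic_variance_fraction(l):
    """Fractional 1-sigma scatter of C_l from 2l+1 independent Gaussian modes."""
    return np.sqrt(2.0 / (2 * l + 1))

def simulated_scatter(l, n_skies=20000, seed=0):
    """Monte Carlo check: scatter of the full-sky estimate when the true C_l = 1."""
    rng = np.random.default_rng(seed)
    # 2l+1 real, unit-variance Gaussian degrees of freedom per simulated sky
    a = rng.standard_normal((n_skies, 2 * l + 1))
    c_hat = np.mean(a**2, axis=1)            # estimate of C_l for each sky
    return c_hat.std()

for l in (2, 10, 100):
    print(l, cosmic_variance_fraction(l), simulated_scatter(l))
```

The scatter falls only as more multipoles become available at higher $l$; for the quadrupole it stays large however well the single observable sky is measured.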

4.5. ASYMPTOTIC INDEPENDENCE AND SMOOTHING 

The considerations we have discussed above generally lead to requirements 
that the correlations between points become small as the separation 
between the points grows large. For a Gaussian field, if $\xi(r) \to 0$ as 
$r \to \infty$ then the probability distributions tend to an asymptotically 
independent form. This must be the case because such a field contains only 
second-order correlations. For example, in the limit that the correlation 
matrix $C_{ij}$ becomes diagonal, the 2-point Gaussian probability density 
$\mathcal{P}_2(\delta_1, \delta_2) \to \mathcal{P}_1(\delta_1)\mathcal{P}_1(\delta_2)$, consistent with the requirement 
of independence, i.e. $P(A, B) = P(A)P(B)$. Similar results will hold for 
higher order N-point distributions. This result means that for Gaussian 
fields absence of (second-order) correlation, i.e. 

$$\langle X_1 X_2 \rangle = \langle X_1 \rangle \langle X_2 \rangle, \qquad (31)$$

means independence. Lack of correlation implies full independence only for 
Gaussian fields. Independence always implies lack of correlation whether the 
field is Gaussian or not. 

We can see then that even though a non-Gaussian field, such as a 
quadratic model, may be uncorrelated on large scales, consistent with the 
requirements above, this does not necessarily mean that points are 
asymptotically independent. 
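
A one-variable toy case makes the distinction between correlation and independence concrete. It is only an analogy for the spatial statement above: a Gaussian variate `x` and the quadratic variate `y = x**2 - 1` built from it have vanishing second-order correlation although `y` is completely determined by `x`. The variable names and sample size are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(200_000)
y = x**2 - 1.0              # quadratic function of x, so y is fully determined by x

# Second-order (linear) correlation between x and y: consistent with zero
corr_xy = np.corrcoef(x, y)[0, 1]

# A higher-order statistic exposes the dependence: x**2 and y are perfectly correlated
corr_x2y = np.corrcoef(x**2, y)[0, 1]

print(corr_xy, corr_x2y)
```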

The reason for discussing this is that it is very relevant to what hap- 
pens to a random field as it is smoothed on successively larger scales. This 
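smoothing is equivalent to filtering the field with a low pass filter. The 
filtered field, $\delta(\mathbf{x}; R_f)$, may be obtained by convolution of the "raw" 
density field with some function F having a characteristic scale $R_f$: 

$$\delta(\mathbf{x}; R_f) = \int \delta(\mathbf{x}')\, F(|\mathbf{x} - \mathbf{x}'|; R_f)\, d\mathbf{x}'. \qquad (32)$$

The filter F has the following properties: $F = \mathrm{constant} \simeq R_f^{-3}$ if 
$|\mathbf{x} - \mathbf{x}'| \ll R_f$, $F \simeq 0$ if $|\mathbf{x} - \mathbf{x}'| \gg R_f$, and $\int F(\mathbf{y}; R_f)\, d\mathbf{y} = 1$. 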
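
A minimal one-dimensional sketch of such a filter, assuming a periodic field sampled on a regular grid and a top-hat window, is given below. In one dimension the normalisation is $1/(2 r_f + 1)$ cells rather than $R_f^{-3}$; the function name `tophat_smooth` and the grid size are hypothetical.

```python
import numpy as np

def tophat_smooth(field, r_f):
    """Periodic 1D top-hat smoothing over 2*r_f+1 cells, kernel normalised to unit sum."""
    n = field.size
    kernel = np.zeros(n)
    kernel[: r_f + 1] = 1.0          # cells at offsets 0..r_f
    if r_f > 0:
        kernel[-r_f:] = 1.0          # cells at offsets -r_f..-1 (periodic wrap)
    kernel /= kernel.sum()           # discrete analogue of  integral F dy = 1
    # circular convolution via the convolution theorem
    return np.fft.ifft(np.fft.fft(field) * np.fft.fft(kernel)).real

rng = np.random.default_rng(0)
raw = rng.standard_normal(4096)      # stand-in for the "raw" field
smooth = tophat_smooth(raw, 8)
print(raw.var(), smooth.var())
```

Because the kernel integrates to unity the mean of the field is preserved, while the variance drops as the smoothing scale grows.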

If the underlying density field is Gaussian then the filtered field will also 
be Gaussian. This is a result of the fact that filtering essentially constructs 
a weighted average of the underlying field and any sum of Gaussian vari- 
ates is itself a Gaussian variate (e.g. Kendall & Stuart 1977). According 
to the Central Limit Theorem, the sum of a large number of independent 
variates drawn from a distribution with finite variance also tends to a Gaus- 
sian distribution. One would imagine, therefore, that if distant points were 
asymptotically independent then the effect of filtering on a non-Gaussian 
field is to "gaussianize" it. In fact this is assumed in standard statistical 
cosmology. We know the small-scale distribution is non-Gaussian, but av- 
eraging over sufficiently large smoothing windows is assumed to recover the 
linear field, or something close to it. But for a general non-Gaussian field 
how quickly do points have to become independent in order for filters to 
Gaussianize the distribution? 

The answer to this question is given by Fan & Bardeen (1995): it 
depends on a function called the Rosenblatt dependence (or "mixing") rate, 
which governs the rate at which the maximum value of $|P(AB) - P(A)P(B)|$ 
tends to zero at large separations (A and B are any combination of values 
of $\delta_i$ in different locations). The rate at which asymptotic independence is 
reached is called the mixing rate. This is quite a technical issue, and we 
leave the details to Fan & Bardeen (1995). On the other hand when the 
field in question is a local transformation of a Gaussian random field (such 
as in the quadratic model) then there is a simple criterion for the mixing 
rate, namely that if the covariance function of the underlying field falls off 
as a power, i.e. as $1/r^q$ as $r \to \infty$, then we require $q > 3$. This is sometimes 
referred to as the requirement for pseudo-Markov behaviour (Adler 1981). If 
this criterion is satisfied then all local transformations of the underlying 
field satisfy the mixing rate condition and consequently become Gaussian if 
smoothed on sufficiently large scales. 

Let us look at this issue in the light of the Peebles model. Peebles 
(1999b) shows that this model is fully self-similar. When smoothed on larger 
and larger scales the distribution function does not tend to a Gaussian but 
retains the same ($\chi^2$) form. At first sight this looks extremely surprising. 
Suppose we imagine a simple version of a random field built upon 
discrete cells of some size R. Suppose the field were uniform within each cell 
but that adjacent cells were generated independently from the $\chi^2$ 
distribution. Filtering on a scale $R_f$ in this case simply corresponds to adding 
neighbouring (uncorrelated) cells. This would produce something like a sum 
of $N = (R_f/R)^3$ independent values from a quadratic model. This would 
produce a resulting field of the form (17). As $R_f$ becomes larger, N 
increases and the resulting distribution changes to a distribution of $\chi^2_n$ of 
larger and larger n. It is well known that, as $n \to \infty$, the distribution of 
$\chi^2_n$ is asymptotically close to the Gaussian as expected from the central 
limit theorem. 
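
The approach to Gaussianity in this independent-cell picture can be followed through the skewness, which for a $\chi^2_n$ distribution is $\sqrt{8/n}$ and vanishes for a Gaussian. The sketch below assumes each cell holds the square of an independent unit Gaussian; the function names and sample sizes are illustrative.

```python
import numpy as np

def chi2_skewness(n):
    """Exact skewness of a chi-squared distribution with n degrees of freedom."""
    return np.sqrt(8.0 / n)

def sample_skewness(n_cells, n_samples=50_000, seed=2):
    """Skewness of the sum of n_cells independent squared unit Gaussians."""
    rng = np.random.default_rng(seed)
    total = (rng.standard_normal((n_samples, n_cells)) ** 2).sum(axis=1)
    centred = total - total.mean()
    return np.mean(centred**3) / np.mean(centred**2) ** 1.5

# N = (R_f / R)^3 cells for R_f / R = 1, 2, 4
for n_cells in (1, 8, 64):
    print(n_cells, chi2_skewness(n_cells), sample_skewness(n_cells))
```

The skewness decays only as $N^{-1/2}$ even when the cells are strictly independent, which gives a sense of how easily long-range correlations can stall the process altogether.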

In the case of the Peebles model, however, the mixing rate condition 
is not satisfied. Although distant points are asymptotically independent, 
the rate at which they tend to independence is not sufficient to produce 
a Gaussian distribution. Regardless of the scale of smoothing, as long as 
the covariance function is chosen to be scale-free, the density field retains 
a distribution having the same shape. This demonstrates that this model 
is, in fact, a kind of fractal as we discussed above. 

Notice that in the more normal case where we take the quadratic 
contribution to represent only the non-linear effect on initially Gaussian 
fluctuations then it is guaranteed to satisfy the mixing rate condition as long 
as the Gaussian field that generates it does. It is therefore justified to 
assume that the model described in Section 3.3 does become Gaussian when 
smoothed on large scales. 

5. Quadratic Phase Coupling 

In §2 we pointed out that a convenient definition of a Gaussian field could be 
made in terms of its Fourier phases, which should be independent and 
uniformly distributed on the interval $[0, 2\pi]$. A breakdown of these conditions, 
such as the correlation of phases of different wavemodes, is a signature that 
the field has become non-Gaussian. In terms of cosmic large-scale structure 
formation, non-Gaussian evolution of the density field is symptomatic of 
the onset of non-linearity in the gravitational collapse process, suggesting 
that phase evolution and non-linear evolution are closely linked. A relatively 
simple picture emerges for models where the primordial density fluctuations 
are Gaussian and the initial phase distribution is uniform. When 
perturbations remain small evolution proceeds linearly, individual modes grow 
independently and the original random phase distribution is preserved. 
However, as perturbations grow large their evolution becomes non-linear 
and Fourier modes of different wavenumber begin to couple together. This 
gives rise to phase association and consequently to non-Gaussianity. It is 
clear that phase associations of this type should be related in some way to 
the existence of the higher order connected covariance functions, which are 
traditionally associated with non-linearity and are non-zero only for 
non-Gaussian fields. In this section such a relationship is explored in detail 
using an analytical model for the non-linearly evolving density fluctuation 
field. Phase correlations of a particular form are identified and their 
connection to the covariance functions is established. 

Figure 1. Numerical simulation of galaxy clustering (left) together with a 
version generated by randomly reshuffling the phases between Fourier modes of 
the original picture (right). 

A graphic demonstration of the importance of phases in patterns 
generally is given in Figure 1. Since the amplitude of each Fourier mode is 
unchanged in the phase reshuffling operation, these two pictures have exactly 
the same power-spectrum, $P(k) \propto |\delta(\mathbf{k})|^2$. In fact, they have more than 
that: they have exactly the same amplitudes for all k. They also have totally 
different morphology. Further demonstrations of the importance of Fourier 
phases in defining clustering morphology are given by Chiang (2001). 
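
An operation of this kind can be sketched in a few lines. Note the substitution: instead of permuting the existing phases among modes, the sketch draws fresh random phases from the transform of real white noise, which automatically respects the reality condition; the amplitudes are left untouched in either case. The function name `shuffle_phases` and the test image are hypothetical.

```python
import numpy as np

def shuffle_phases(image, seed=0):
    """Return a real image with the same Fourier amplitudes but randomised phases."""
    rng = np.random.default_rng(seed)
    amplitudes = np.abs(np.fft.fft2(image))
    # Phases taken from the transform of real white noise obey phi(-k) = -phi(k),
    # so the inverse transform below is real up to rounding error.
    noise_phases = np.angle(np.fft.fft2(rng.standard_normal(image.shape)))
    return np.fft.ifft2(amplitudes * np.exp(1j * noise_phases)).real

rng = np.random.default_rng(7)
original = rng.standard_normal((32, 32)) ** 2     # a simple non-Gaussian pattern
scrambled = shuffle_phases(original)

same_amplitudes = np.allclose(np.abs(np.fft.fft2(original)),
                              np.abs(np.fft.fft2(scrambled)))
print(same_amplitudes)
```

The two arrays share every Fourier amplitude, and hence the power-spectrum, yet differ pixel by pixel.
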

5.1. QUADRATIC DENSITY FIELDS 

It is useful at this stage to introduce a particular form of non-Gaussian 
field that serves both as a kind of phenomenological paradigm and as a 
reasonably realistic model of non-linear evolution from Gaussian initial 
conditions. The model involves a field which is generated by a simple quadratic 
transformation of a Gaussian distribution, hence the term quadratic 
non-linearity. Quadratic fields have been discussed before in a number of 
contexts (e.g. Coles & Barrow 1987; Moscardini et al. 1991; Falk, Rangarajan & 
Srednicki 1993; Luo & Schramm 1993; Luo 1994; Gangui et al. 1994; Koyama, 
Soda & Taruya 1999; Peebles 1999a,b; Matarrese, Verde & Jimenez 2000; Verde 
et al. 2000; Verde et al. 2001; Komatsu & Spergel 2001; Shandarin 2002; 
Bartolo, Matarrese & Riotto 2002); for further discussion see below. The 
motivation is very similar to that of Coles & Jones (1991), which introduced 
the lognormal density field as an illustration of some of the consequences 
of a more extreme form of non-linearity involving an exponential 
transformation of the linear density field. 

5.2. A SIMPLE NON-LINEAR MODEL 

We adopt the simple perturbative expansion of equation (20) in order to 
model the non-linear evolution of the density field. Although the equivalent 
transformation in formal Eulerian perturbation theory is a good deal more 
complicated, the kind of phase associations that we will deal with here are 
precisely the same in either case. In terms of the Fourier modes, in the 
continuum limit, we have for the first order Gaussian term 

$$\delta_1(\mathbf{x}) = \int d^3k\, |\delta_{\mathbf{k}}| \exp[i\phi_{\mathbf{k}}] \exp[i\mathbf{k} \cdot \mathbf{x}] \qquad (33)$$

and for the second-order perturbation 

$$\delta_2(\mathbf{x}) = [\delta_1(\mathbf{x})]^2 = \int d^3k\, d^3k'\, |\delta_{\mathbf{k}}||\delta_{\mathbf{k}'}| \exp[i(\phi_{\mathbf{k}} + \phi_{\mathbf{k}'})] \exp[i(\mathbf{k} + \mathbf{k}') \cdot \mathbf{x}]. \qquad (34)$$

The quadratic field, $\delta_2$, illustrates the idea of mode coupling associated with 
non-linear evolution. The non-linear field depends on a specific harmonic 
relationship between the wavenumber and phase of the modes at $\mathbf{k}$ and $\mathbf{k}'$. 
This relationship between the phases in the non-linear field, i.e. 

$$\phi_{\mathbf{k}} + \phi_{\mathbf{k}'} = \phi_{\mathbf{k}+\mathbf{k}'}, \qquad (35)$$

where the RHS represents the phase of the non-linear field, is termed 
quadratic phase coupling. 
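
Quadratic phase coupling is easy to see in a one-dimensional toy case with just two modes. Squaring a field made of two cosines with wavenumbers `k1`, `k2` and phases `phi1`, `phi2` generates a mode at `k1 + k2` whose phase is `phi1 + phi2`. All names and numerical values below are illustrative assumptions.

```python
import numpy as np

n = 64
x = 2.0 * np.pi * np.arange(n) / n
k1, k2 = 3, 5                      # two illustrative wavenumbers
phi1, phi2 = 0.7, 1.9              # their (arbitrary) phases

delta1 = np.cos(k1 * x + phi1) + np.cos(k2 * x + phi2)   # "linear" field
delta2 = delta1**2                                        # quadratic transformation

modes = np.fft.fft(delta2)
coupled_phase = np.angle(modes[k1 + k2])    # phase of the generated mode at k1 + k2

print(coupled_phase, phi1 + phi2)
```

The same construction also produces a mode at `k2 - k1` carrying the phase difference, which is the coupling involving the conjugate mode.
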

5.3. THE TWO-POINT COVARIANCE FUNCTION 

The two-point covariance function can be calculated using the definitions 
of §2, namely 

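$$\xi(\mathbf{r}) = \langle \delta(\mathbf{x}) \delta(\mathbf{x} + \mathbf{r}) \rangle. \qquad (36)$$

Substituting the non-linear transform for $\delta(\mathbf{x})$ (equation 20) into this 
expression gives four terms 

$$\xi(\mathbf{r}) = \langle \delta_1(\mathbf{x}) \delta_1(\mathbf{x}+\mathbf{r}) \rangle + \epsilon \langle \delta_1(\mathbf{x}) \delta_2(\mathbf{x}+\mathbf{r}) \rangle + \epsilon \langle \delta_2(\mathbf{x}) \delta_1(\mathbf{x}+\mathbf{r}) \rangle + \epsilon^2 \langle \delta_2(\mathbf{x}) \delta_2(\mathbf{x}+\mathbf{r}) \rangle. \qquad (37)$$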

The first of these terms is the linear contribution to the covariance function 
whereas the remaining three give the non-linear corrections. We shall focus 
on the lowest order term for now. 

As we outlined in Section 2, the angle brackets $\langle\,\rangle$ in these expressions 
are expectation values, formally denoting an average over the probability 
distribution of $\delta(\mathbf{x})$. Under the fair sample hypothesis we replace the 
expectation values in equation (36) with averages over a selection of 
independent volumes so that $\langle\,\rangle \to \langle\,\rangle_{\rm vol,\,real}$. The first average is simply a 
volume integral over a sufficiently large patch of the universe. The second 
average is over various realisations of the $\delta_{\mathbf{k}}$ and $\phi_{\mathbf{k}}$ in the different 
patches. Applying these rules to the first term of equation (37) and 
performing the volume integration gives 

$$\xi_{11}(\mathbf{r}) = \int d^3k\, d^3k'\, \langle |\delta_{\mathbf{k}}||\delta_{\mathbf{k}'}| \exp[i(\phi_{\mathbf{k}} + \phi_{\mathbf{k}'})] \rangle_{\rm real}\, \delta_D(\mathbf{k} + \mathbf{k}') \exp[i\mathbf{k}' \cdot \mathbf{r}], \qquad (38)$$

where $\delta_D$ is the Dirac delta function. The above expression is simplified 
given the reality condition 

$$\delta_{\mathbf{k}}^{*} = \delta_{-\mathbf{k}}, \qquad (39)$$

from which it is evident that the phases obey 

$$\phi_{\mathbf{k}} + \phi_{-\mathbf{k}} = 0 \mod[2\pi]. \qquad (40)$$

Integrating equation (38) one therefore finds that 

$$\xi_{11}(\mathbf{r}) = \int d^3k\, \langle |\delta_{\mathbf{k}}|^2 \rangle_{\rm real} \exp[-i\mathbf{k} \cdot \mathbf{r}], \qquad (41)$$

so that the final result is independent of the phases. Indeed this is just the 
Fourier transform relation between the two-point covariance function and 
the power spectrum we derived in §2.1. 
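
This phase-independence can be verified numerically for a discrete periodic field. The sketch below assumes a one-dimensional grid and NumPy's FFT sign and normalisation conventions; it compares a direct volume average of $\delta(x)\delta(x+r)$ with the inverse transform of the squared mode amplitudes, in which the phases never appear.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 256
delta = rng.standard_normal(n)
delta -= delta.mean()

# Direct estimate: xi(r) = < delta(x) delta(x + r) >, averaged over x (periodic box)
xi_direct = np.array([np.mean(delta * np.roll(delta, -r)) for r in range(n)])

# Via the power spectrum: inverse transform of |delta_k|^2, phases discarded
delta_k = np.fft.fft(delta)
xi_fourier = np.fft.ifft(np.abs(delta_k) ** 2).real / n

print(np.max(np.abs(xi_direct - xi_fourier)))
```
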

5.4. THE THREE-POINT COVARIANCE FUNCTION 

Using the same arguments outlined above it is possible to calculate the 
3-point connected covariance function, which is defined as 

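$$\zeta(\mathbf{r}, \mathbf{s}) = \langle \delta(\mathbf{x}) \delta(\mathbf{x} + \mathbf{r}) \delta(\mathbf{x} + \mathbf{s}) \rangle_c. \qquad (42)$$

Making the non-linear transform of equation (20) one finds the following 
contributions 

$$\zeta(\mathbf{r}, \mathbf{s}) = \langle \delta_1(\mathbf{x}) \delta_1(\mathbf{x} + \mathbf{r}) \delta_1(\mathbf{x} + \mathbf{s}) \rangle_c + \epsilon \langle \delta_1(\mathbf{x}) \delta_1(\mathbf{x} + \mathbf{r}) \delta_2(\mathbf{x} + \mathbf{s}) \rangle_c + {\rm perms}(121, 211)$$
$$\qquad + \epsilon^2 \langle \delta_1(\mathbf{x}) \delta_2(\mathbf{x} + \mathbf{r}) \delta_2(\mathbf{x} + \mathbf{s}) \rangle_c + {\rm perms}(212, 221) + \epsilon^3 \langle \delta_2(\mathbf{x}) \delta_2(\mathbf{x} + \mathbf{r}) \delta_2(\mathbf{x} + \mathbf{s}) \rangle_c. \qquad (43)$$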

Again we consider first the lowest order term. Expanding in terms of the 
Fourier modes and once again replacing averages as prescribed by the fair 
sample hypothesis gives 

$$\zeta_{111}(\mathbf{r}, \mathbf{s}) = \int d^3k\, d^3k'\, d^3k''\, \langle |\delta_{\mathbf{k}}||\delta_{\mathbf{k}'}||\delta_{\mathbf{k}''}| \exp[i(\phi_{\mathbf{k}} + \phi_{\mathbf{k}'} + \phi_{\mathbf{k}''})] \rangle_{\rm real}$$
$$\qquad \times\, \delta_D(\mathbf{k} + \mathbf{k}' + \mathbf{k}'') \exp[i\mathbf{k}' \cdot \mathbf{r}] \exp[i\mathbf{k}'' \cdot \mathbf{s}]. \qquad (44)$$

Recall that $\delta_1$ is a Gaussian field so that $\phi_{\mathbf{k}}$, $\phi_{\mathbf{k}'}$ and $\phi_{\mathbf{k}''}$ are independent 
and uniformly random on the interval $[0, 2\pi]$. Upon integration over one 
of the wavevectors the phase term is modified so that its argument 
contains the sum $(\phi_{\mathbf{k}} + \phi_{\mathbf{k}'} + \phi_{-\mathbf{k}-\mathbf{k}'})$, or a permutation thereof. Whereas the 
reality condition of equation (39) implies a relationship between phases of 
anti-parallel wavevectors, no such conditions hold for modes linked by the 
triangular constraint imposed by the Dirac delta function. In other words, 
except for serendipity, 

$$\phi_{\mathbf{k}} + \phi_{\mathbf{k}'} + \phi_{-\mathbf{k}-\mathbf{k}'} \neq 0. \qquad (45)$$

In fact due to the circularity of phases, the resulting sum is still just 
uniformly random on the interval $[0, 2\pi]$ if the phases are random. Upon 
averaging over sufficient realisations, the phase term will therefore cancel to 
zero so that the lowest order contribution to the 3-point function vanishes, 
i.e. $\zeta_{111}(\mathbf{r}, \mathbf{s}) = 0$. This is not a new result, but it does explicitly illustrate 
how the vanishing of the three-point connected covariance function arises 
in terms of the Fourier phases. 

Next consider the first non-linear contribution to the 3-point function 
given by 

$$\zeta_{112}(\mathbf{r}, \mathbf{s}) = \epsilon \langle \delta_1(\mathbf{x}) \delta_1(\mathbf{x} + \mathbf{r}) \delta_2(\mathbf{x} + \mathbf{s}) \rangle, \qquad (46)$$

or one of its permutations. In this case one of the arguments in the average 
is the field $\delta_2(\mathbf{x})$, which exhibits quadratic phase coupling of the form (35). 
Expanding this term to the point of equation (44) using the definition (34) 
one obtains 

$$\zeta_{112}(\mathbf{r}, \mathbf{s}) = \epsilon \int d^3k\, d^3k'\, d^3k''\, d^3k'''\, \langle |\delta_{\mathbf{k}}||\delta_{\mathbf{k}'}||\delta_{\mathbf{k}''}||\delta_{\mathbf{k}'''}| \exp[i(\phi_{\mathbf{k}} + \phi_{\mathbf{k}'} + \phi_{\mathbf{k}''} + \phi_{\mathbf{k}'''})] \rangle_{\rm real}$$
$$\qquad \times\, \delta_D(\mathbf{k} + \mathbf{k}' + \mathbf{k}'' + \mathbf{k}''')$$
$$\qquad \times \exp[i\mathbf{k}' \cdot \mathbf{r}] \exp[i(\mathbf{k}'' + \mathbf{k}''') \cdot \mathbf{s}]. \qquad (47)$$

Once again the Dirac delta function imposes a general constraint upon 
the configuration of wavevectors. Integrating over one of the $\mathbf{k}$ gives 
$\mathbf{k}''' = -\mathbf{k} - \mathbf{k}' - \mathbf{k}''$ for example, so that the wavevectors must form a closed loop. 
This general constraint, however, does not specify a precise shape of loop; 
instead the remaining integrals run over all of the different possibilities. 
At this point we may constrain the problem more tightly by noting that 
most combinations of the $\mathbf{k}$ will contribute zero to $\zeta_{112}$. This is because of 
the circularity property of the phases and equation (45). Indeed, the only 
nonzero contributions arise where we are able to apply the phase relation 
obtained from the reality constraint, equation (40). In other words the 
properties of the phases dictate that the wavevectors must align in 
anti-parallel pairs: $\mathbf{k} = -\mathbf{k}'$, $\mathbf{k}'' = -\mathbf{k}'''$ and so forth. 

There is a final constraint that must be imposed upon the $\mathbf{k}$ if $\zeta$ is 
the connected 3-point covariance function. In a graph theoretic sense, the 
general (unconnected) N-point function $\langle \delta_{l_1}(\mathbf{x}_1) \delta_{l_2}(\mathbf{x}_2) \ldots \delta_{l_N}(\mathbf{x}_N) \rangle$ can be 
represented geometrically by a sum of tree diagrams. Each diagram consists 
of N nodes of order $l_i$, representing the $\delta_{l_i}(\mathbf{x}_i)$, and a number of linking 
lines denoting their correlations; see Fry (1984) or Bernardeau (1992) for 
more detailed accounts. Every node is made up of $l_i$ internal points, each of 
which represents a factor $\delta_{\mathbf{k}} = |\delta_{\mathbf{k}}| \exp(i\phi_{\mathbf{k}})$ in the Fourier expansion. According to 
the rules for constructing diagrams, linking lines may join one internal point 
to a single other, either within the same node or in an external node. The 
connected covariance functions are represented specifically by the subset 
of diagrams for which every node is linked to at least one other, leaving 
none completely isolated. This constraint implies that certain pairings of 
wavevectors do not contribute to the connected covariance function. For 
more details, see Watts & Coles (2002). 

The above constraints may be inserted into equation (47) by re-writing 
the Dirac delta function as a product over delta functions of two 
arguments, appropriately normalised. There are only two allowed combinations 
of wavevectors so we have 

$$\delta_D(\mathbf{k} + \mathbf{k}' + \mathbf{k}'' + \mathbf{k}''') \to [\delta_D(\mathbf{k} + \mathbf{k}'')\, \delta_D(\mathbf{k}' + \mathbf{k}''') + \delta_D(\mathbf{k} + \mathbf{k}''')\, \delta_D(\mathbf{k}' + \mathbf{k}'')]. \qquad (48)$$

Integrating over two of the $\mathbf{k}$ and using equation (40) eliminates the phase 
terms and leaves the final result 

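$$\zeta_{112}(\mathbf{r}, \mathbf{s}) = 2\epsilon \int d^3k\, d^3k'\, \langle |\delta_{\mathbf{k}}|^2 |\delta_{\mathbf{k}'}|^2 \rangle_{\rm real} \exp[i\mathbf{k}' \cdot \mathbf{r}] \exp[-i(\mathbf{k} + \mathbf{k}') \cdot \mathbf{s}]. \qquad (49)$$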

The existence of this quantity has therefore been shown to depend on the 
quadratic phase coupling of Fourier modes. The relationship between modes 
and the interpretation of the tree diagrams is also dictated by the properties 
of the phases. 

One may apply the same rules to the higher order terms in equation 
(43). It is immediately clear that the $\zeta_{122}$ terms are zero because there is 
no way to eliminate the phase term $\exp[i(\phi_{\mathbf{k}} + \phi_{\mathbf{k}'} + \phi_{\mathbf{k}''} + \phi_{\mathbf{k}'''} + \phi_{\mathbf{k}''''})]$, 
a consequence of the property equation (45). Diagrammatically this 
corresponds to an unpaired internal point within one of the nodes of the tree. 
The final, highest order contribution to the 3-point function is found to be 

$$\zeta_{222}(\mathbf{r}, \mathbf{s}) = \int d^3k\, d^3k'\, d^3k''\, \langle |\delta_{\mathbf{k}}|^2 |\delta_{\mathbf{k}'}|^2 |\delta_{\mathbf{k}''}|^2 \rangle_{\rm real}$$
$$\qquad \times \exp[i(\mathbf{k} - \mathbf{k}') \cdot \mathbf{r}] \exp[i(\mathbf{k}' - \mathbf{k}'') \cdot \mathbf{s}], \qquad (50)$$

where the phase and geometric constraints allow 12 possible combinations 
of wavevectors. 
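
The role of the phases in this section can be illustrated with a small Monte Carlo experiment, offered as a sketch rather than as the calculation above. It assumes a one-dimensional field $\delta = g + \epsilon g^2$ built from Gaussian white noise `g`, and averages $\exp[i(\phi_k + \phi_{k'} - \phi_{k+k'})]$ over realisations; by the reality condition, $-\phi_{k+k'}$ is the phase of the mode at $-k-k'$. The function name, wavenumbers and value of `eps` are illustrative.

```python
import numpy as np

def mean_phase_closure(eps, k1=3, k2=5, n=64, n_real=4000, seed=4):
    """Average of exp[i(phi_k1 + phi_k2 - phi_{k1+k2})] over realisations
    of the 1D field delta = g + eps * g**2, with g Gaussian white noise."""
    rng = np.random.default_rng(seed)
    g = rng.standard_normal((n_real, n))
    delta = g + eps * g**2
    phases = np.angle(np.fft.fft(delta, axis=1))
    closure = phases[:, k1] + phases[:, k2] - phases[:, k1 + k2]
    return np.mean(np.exp(1j * closure))

gaussian = mean_phase_closure(eps=0.0)     # random phases: the average cancels
quadratic = mean_phase_closure(eps=0.5)    # coupled phases: a coherent residual

print(abs(gaussian), quadratic.real)
```

For the Gaussian field the triangle phase sum is uniformly distributed, so the average shrinks towards zero as more realisations are added; the quadratic term leaves a positive real part that does not average away.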

5.5. CUBIC NON-LINEARITY AND HIGHER ORDER 

The above ideas extend simply to higher order where the non-linear field is 
represented by a perturbation series that does not truncate at the quadratic 
term. At the next highest order for example, the series includes 
$\delta_3 = [\delta_1]^3$, which introduces a cubic phase coupling in the Fourier expansion. 
Although quadratic phase coupling is essential as a minimum requirement for 
the three point covariance function, cubic phase coupling is not the min- 
imum requirement for the next highest order covariance function. Indeed, 
quadratic coupling is sufficient to provide contributions to all of the n-point 
covariance functions due to the way the phases dictate that wave-vectors 
must arrange themselves into antiparallel pairs. For the 4-point covariance 
function the cubic term allows for a different diagrammatic representation: 
a star as opposed to a snake topology. However, in terms of the constituent 
wave-vectors, loops (as in Figure 3) contributing to the star topologies are 
a symmetric subset of those contributing to the snake topologies. 

5.6. POWER-SPECTRUM AND BISPECTRUM 

The formal development of the relationship between covariance functions 
and power-spectra developed above suggests the usefulness of higher-order 
versions of $P(k)$. It is clear from the arguments of Section 5.2 that a more 
convenient notation for the power-spectrum than that introduced in Section 
2.1 is 

$$\langle \delta_{\mathbf{k}} \delta_{\mathbf{k}'} \rangle = (2\pi)^3 P(k)\, \delta_D(\mathbf{k} + \mathbf{k}'). \qquad (51)$$

The connection between phases and higher-order covariance functions 
obtained in Section 5.3 also suggests defining higher-order polyspectra of the 
form 

$$\langle \delta_{\mathbf{k}} \delta_{\mathbf{k}'} \ldots \delta_{\mathbf{k}^{(n)}} \rangle = (2\pi)^3 P_n(\mathbf{k}, \mathbf{k}', \ldots, \mathbf{k}^{(n)})\, \delta_D(\mathbf{k} + \mathbf{k}' + \ldots + \mathbf{k}^{(n)}), \qquad (52)$$

where the occurrence of the delta-function in this expression arises from 
a generalisation of the reality constraint given in equation (40); see, e.g., 
Peebles (1980). Conventionally the version of this with n = 3 produces 
the bispectrum, usually called $B(\mathbf{k}, \mathbf{k}', \mathbf{k}'')$, which has found much effective 
use in recent studies of large-scale structure (Peebles 1980; Scoccimarro 
et al. 1998; Scoccimarro, Couchman & Frieman 1999; Verde et al. 2000; 
Verde et al. 2001; Verde et al. 2002). It is straightforward to show that the 
bispectrum is the Fourier-transform of the (reduced) three-point covariance 
function by following similar arguments as in Section 5.2; see, e.g., Peebles 
(1980). 

Note that the delta-function constraint requires the bispectrum to be 
zero except for k-vectors $(\mathbf{k}, \mathbf{k}', \mathbf{k}'')$ that form a triangle in k-space. From 
Section 5.3 it is clear that the bispectrum can only be non-zero when there is 
a definite relationship between the phases accompanying the modes whose 
wave-vectors form a triangle. Moreover the pattern of phase association 
necessary to produce a real and non-zero bispectrum is precisely that which 
is generated by quadratic phase association. This shows, in terms of phases, 
why it is that the leading order contributions to the bispectrum emerge 
from second-order fluctuations of a Gaussian random field. The bispectrum 
measures quadratic phase coupling. 

Three-point phase correlations have another interesting property. While 
the bispectrum is usually taken to be an ensemble-averaged quantity, as 
defined in equation (46), it is interesting to consider products of terms 
$\delta_{\mathbf{k}} \delta_{\mathbf{k}'} \delta_{\mathbf{k}''}$ obtained from an individual realisation. According to the fair 
sample hypothesis discussed above we would hope appropriate averages of such 
quantities would yield an estimate of the bispectrum. Note that 

$$\delta_{\mathbf{k}} \delta_{\mathbf{k}'} \delta_{-\mathbf{k}-\mathbf{k}'} = \delta_{\mathbf{k}} \delta_{\mathbf{k}'} \delta^{*}_{\mathbf{k}+\mathbf{k}'} = \beta(\mathbf{k}, \mathbf{k}'), \qquad (53)$$

using the requirement (40), together with the triangular constraint we 
discussed above. Each $\beta(\mathbf{k}, \mathbf{k}')$ will carry its own phase, say $\phi_{\mathbf{k},\mathbf{k}'}$, which obeys 

$$\phi_{\mathbf{k},\mathbf{k}'} = \phi_{\mathbf{k}} + \phi_{\mathbf{k}'} - \phi_{\mathbf{k}+\mathbf{k}'}. \qquad (54)$$

It is evident from this that it is possible to recover the complete set of phases 
$\phi_{\mathbf{k}}$ from the bispectral phases $\phi_{\mathbf{k},\mathbf{k}'}$, up to a constant phase offset 
corresponding to a global translation of the entire structure (Chiang & Coles 
2000). This furnishes a conceptually simple method of recovering missing 
or contaminated phase information in a consistent way, an idea which has 
been exploited, for example, in speckle interferometry (Lohmann, Weigelt 
& Wirnitzer 1983). In the case of quadratic phase coupling, described by 
equation (35), the left-hand-side of equation (54) is identically zero leading 
to a particularly simple approach to this problem. 
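
The insensitivity of the bispectral phase to a global translation can be checked directly. The sketch below assumes a periodic one-dimensional field and NumPy's FFT conventions; the helper `bispectral_phase`, the test field and the shift of 17 cells are illustrative choices. A translation changes each phase by an amount linear in the wavenumber, which cancels in the combination of equation (54).

```python
import numpy as np

def bispectral_phase(field, k1, k2):
    """phi_{k1,k2} = phi_k1 + phi_k2 - phi_{k1+k2} for a periodic 1D field."""
    phases = np.angle(np.fft.fft(field))
    total = phases[k1] + phases[k2] - phases[k1 + k2]
    return np.angle(np.exp(1j * total))      # wrap into (-pi, pi]

rng = np.random.default_rng(5)
field = rng.standard_normal(128) ** 2        # an arbitrary non-Gaussian test field
shifted = np.roll(field, 17)                 # global translation by 17 cells

before = bispectral_phase(field, 4, 9)
after = bispectral_phase(shifted, 4, 9)

print(before, after)
```

The individual phases of the shifted field differ from the originals, but the bispectral phase agrees to rounding error.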

6. Discussion 

In this lecture I addressed two main issues, using the quadratic model as an 
illustrative example. First I showed explicitly how this non-Gaussian model 
has properties that contradict standard folklore based on the assumption of 
Gaussian fluctuations. I used this model to distinguish carefully between 
various inter-related concepts such as sample homogeneity, statistical ho- 
mogeneity, asymptotic independence, ergodicity, and so on. I showed the 
conditions under which each of these is relevant and deployed the quadratic 
model for particular examples in which they are violated. I then used the 
quadratic model to show how phase association arises in non-linear 
processes, with exactly the correct form to generate non-zero bispectra 
and three-point covariance functions. The magnitude of these statistical 
descriptors is of course related to the magnitude of the Fourier modes, 
but the factor that determines whether they are zero or non-zero is the 
arrangement of the phases of these modes. 

The connection between polyspectra and phase information is an im- 
portant one and it opens up many lines of future research, such as how 
phase correlations relate to redshift distortion and bias. Also, I assumed 
throughout this study that we could straightforwardly take averages over a 
large spatial domain to be equal to ensemble averages. Using small volumes 
of course leads to sampling uncertainties which are quite straightforward 
to deal with in the case of the power-spectra but more problematic for 
higher-order spectra like the bispectrum. Understanding the fluctuations 
about ensemble averages in terms of phases could also lead to important 
insights. 